CLAIMS 

1. A method for efficiently parsing input data, comprising:
receiving a data file;
retrieving a stored version of the data file and a template/token tree corresponding to the data file, the tree including at least one static node;
comparing the stored version of the data file with the received data file to identify non-matching content in the received data file;
parsing only the non-matching content to form subtrees;
creating a mapping from the template/token tree to the subtrees.
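
Illustrative sketch (not part of the claims): the steps recited above can be approximated in Python as a diff followed by a parse of only the changed lines. Everything named here is an assumption made for illustration: the line-based comparison via `difflib`, the flat list standing in for the template/token tree, the `parse_fragment` stand-in parser, and the `T0`, `T1`, ... token names. The claims do not prescribe any of these.

```python
import difflib


def parse_fragment(lines):
    """Hypothetical stand-in parser: wraps changed lines in a tiny subtree."""
    return {"tag": "fragment", "children": [line.strip() for line in lines]}


def incremental_parse(stored_lines, received_lines):
    """Compare stored and received content; parse only what differs.

    Returns (template, mapping), where template is a flat list of
    ("static", lines) and ("token", name) entries, and mapping relates
    each token name to the subtree parsed from the non-matching content.
    """
    matcher = difflib.SequenceMatcher(
        a=stored_lines, b=received_lines, autojunk=False
    )
    template, mapping = [], {}
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            # Matching content is kept as a static entry and never re-parsed.
            template.append(("static", stored_lines[i1:i2]))
        else:
            # Non-matching content is parsed and reached through a token.
            token = f"T{len(mapping)}"
            template.append(("token", token))
            mapping[token] = parse_fragment(received_lines[j1:j2])
    return template, mapping


stored = ["<html>", "<p>Balance: 10</p>", "</html>"]
received = ["<html>", "<p>Balance: 25</p>", "</html>"]
template, mapping = incremental_parse(stored, received)
```

In this sketch only the single changed line is handed to the parser; the two unchanged lines are carried over as static entries.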

2. The method of claim 1 wherein the step of creating the mapping from the tree to the subtrees further comprises:
replacing at least one static node of the template/token tree with a token; and
creating a mapping from each token to at least one subtree.

3. The method of claim 1 wherein creating the mapping from the tree to the subtrees further comprises:
adding at least one token node to the template/token tree; and
creating a mapping from each token to at least one subtree.

4. The method of claim 1 wherein the data file is a web page.

5. The method of claim 1 wherein the data file is an HTML file.

6. A method for efficiently parsing web pages, comprising:
receiving a first HTML page;
retrieving a cached version of the HTML page and a template/token tree corresponding to the first HTML page, the tree including at least one static node;
comparing the cached version of the HTML page with the received HTML page to identify non-matching content in the received HTML page;
parsing only the non-matching content to form at least one subtree;
creating a mapping from the template/token tree to the subtrees.

7. The method of claim 6 wherein creating the mapping from the tree to the subtrees further comprises:
replacing at least one static node of the template/token tree with a token; and
creating a mapping from each token to at least one subtree.

8. A method for efficiently parsing HTML pages, comprising:
receiving a first HTML page;
responsive to a determination that a cached version of the HTML page exists:
    retrieving the cached version of the HTML page and a first template/token tree corresponding to the first HTML page, the first tree including at least one static node;
    comparing the cached version of the first HTML page with the received HTML page to identify non-matching content;
    parsing only the non-matching content to form a subtree;
    associating the first tree and the subtree;
responsive to a determination that the cached version of the HTML page does not exist:
    parsing the received HTML page to form a second template/token tree, the second tree containing at least one static node; and
    storing the second tree and the received HTML page.
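
Illustrative sketch (not part of the claims): the two branches recited above, cache hit and cache miss, might be arranged as follows in Python. The dictionary cache keyed by URL, the `parse` stand-in, and the line-based `difflib` comparison are all assumptions made for this example; the claims do not specify how pages are keyed, compared, or parsed.

```python
import difflib


def parse(lines):
    """Hypothetical stand-in parser: wraps lines in a one-level tree."""
    return {"tag": "root", "children": list(lines)}


def handle_page(cache, url, received_lines):
    """Return (tree, subtree); subtree is None when nothing was cached."""
    if url in cache:
        # Cached version exists: retrieve it with its stored tree.
        cached_lines, tree = cache[url]
        matcher = difflib.SequenceMatcher(
            a=cached_lines, b=received_lines, autojunk=False
        )
        changed = []
        for op, _i1, _i2, j1, j2 in matcher.get_opcodes():
            if op != "equal":
                changed.extend(received_lines[j1:j2])
        # Parse only the non-matching content, then associate the
        # stored tree with the resulting subtree by returning both.
        return tree, parse(changed)
    # No cached version: parse the whole page and store tree and page.
    tree = parse(received_lines)
    cache[url] = (list(received_lines), tree)
    return tree, None


cache = {}
first = ["<html>", "<p>a</p>", "</html>"]
second = ["<html>", "<p>b</p>", "</html>"]
tree1, sub1 = handle_page(cache, "http://example.com/", first)
tree2, sub2 = handle_page(cache, "http://example.com/", second)
```

The first call takes the miss branch and performs a full parse; the second call reuses the stored tree and parses only the one line that changed.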



9. A method for providing derivative services comprising:
receiving a first HTML page;
constructing a template/token tree from the received HTML page, the tree comprising a plurality of nodes;
determining that at least one node of the tree contains static content;
determining that at least one node of the tree contains dynamic content;
replacing the nodes of the tree containing dynamic content with tokens;
parsing the dynamic content to form subtrees; and
mapping the tokens to the subtrees.
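
Illustrative sketch (not part of the claims): replacing dynamic nodes with tokens and mapping each token to a subtree could look like the following Python. The nested-dictionary tree, the caller-supplied `is_dynamic` predicate, and the `T0`, `T1`, ... token names are hypothetical; the claim does not say how static and dynamic content are told apart.

```python
from itertools import count


def tokenize_tree(node, is_dynamic, mapping, ids=None):
    """Return a copy of the tree with dynamic nodes replaced by tokens.

    Each replaced node is recorded in `mapping` under its token name,
    so the template keeps only static structure plus token nodes.
    """
    ids = ids if ids is not None else count()
    if is_dynamic(node):
        token = f"T{next(ids)}"
        mapping[token] = node  # the dynamic node serves as the subtree
        return {"token": token}
    return {
        "tag": node["tag"],
        "text": node.get("text", ""),
        "children": [
            tokenize_tree(child, is_dynamic, mapping, ids)
            for child in node.get("children", [])
        ],
    }


page = {
    "tag": "html",
    "children": [
        {"tag": "h1", "text": "My Bank"},
        {"tag": "span", "text": "$25.00", "dynamic": True},
    ],
}
mapping = {}
template = tokenize_tree(page, lambda n: n.get("dynamic", False), mapping)
```

Here the heading stays in the template as a static node, while the balance node is replaced by a token that the mapping resolves to its subtree.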

10. A method of providing derivative services, comprising:
receiving a request for derivative services content from a customer;
retrieving data from a plurality of primary service providers on behalf of the customer, by:
    identifying static content that has been previously retrieved from the primary service providers and stored, and corresponding template/token trees that have also been stored;
    identifying dynamic content that differs from the previously retrieved content;
    parsing the dynamic content to form subtrees;
    adding tokens to the template/token trees;
    mapping the tokens to the subtrees;
creating at least one content page comprising the retrieved data; and
providing the created pages to the customer.

11. A method for efficiently parsing input data, comprising:
receiving a first data file;
retrieving a stored template/token tree, the stored template/token tree having content associated with the first data file and containing at least one static node and at least one token;
retrieving a second data file, the second data file associated with the first data file;
identifying non-matching content present only in the first data file;
parsing only the non-matching content of the first data file to form at least one subtree; and
mapping at least one of the tokens to at least one of the subtrees.

12. The method of claim 11, further comprising:
responsive to identifying non-matching content present only in the first file:
    adding at least one new token to the template/token tree.

13. A system for efficiently parsing input data, comprising:
at least one virtual browser for retrieving content from primary content servers;
an identification engine, communicatively coupled to the virtual browser for identifying retrieved content;
a cache, communicatively coupled to the virtual browser and the parsing engine, for storing retrieved content and template/token trees;
a comparison engine, coupled to the virtual browser for comparing retrieved content with stored content to identify differing content not stored in the cache;
a parsing engine, communicatively coupled to the virtual browser, for parsing content identified by the comparison engine as differing content and forming subtrees from the content; and
a content server, coupled to the virtual browser.

14. The system of claim 13, further comprising a token master, coupled to the cache, for allocating new tokens to the virtual browser.